Normalized Difference Water Index for Cultural Heritage

A Reproducible Method for Monitoring Flooded Archaeological Sites

Andrea Titolo

University of Turin

Dams and Western Asian Archaeology


Marchetti et al. 2019

Location of active dams (above 15 m) and their reservoirs (GRanD data)

Second-phase salvage approach

  • How can we identify which sites have been affected by water level fluctuations?

  • What are the effects of prolonged submersion of these sites?

  • How can we document, monitor, safeguard and plan rescue operations?

Location of active dams (above 15 m) and their reservoirs (GRanD data)

Talk overview

  1. Workflow description
  2. How does it work?
    • Satellite image selection
    • NDWI Time-Series processing
    • Image processing
    • Calculation of the percentage of surfaced area
    • Technical limitations
  3. Results, Reproducibility, and open-source
  4. Conclusions and future perspectives

Introduction

1 2 3 4

Manual inspection of the Haditha Dam

1 2 3 4

  • Remote sensing inspection of the Haditha Dam reservoir
  • Bing vs Google Earth/Sentinel images showed partial emersion of archaeological sites

First workflow iteration

1 2 3 4

  • Titolo 2021
  • Case studies: Haditha Dam, Mosul Dam, Hamrin Dam
  • Analyses with one image composite per year
  • Partial experimentation with monthly monitoring

First workflow iteration

1 2 3 4

  • At the time, the approach would have greatly benefit from:
    • More robust archaeological dataset (i.e. more site areas)
    • An approach focused more on monthly inspections

Second workflow iteration

1 2 3 4

  • Long-term inspection of the Mosul Dam reservoir
  • Monthly image composites

Sconzo-Simi-Titolo 2023 “Drowned Landscapes: The Rediscovered Archaeological Heritage of the Mosul Dam Reservoir.” BASOR (Ahead of Print). doi:10.1086/724419

How does it works?

1 2 3 4

Issues to tackle

1 2 3 4

The process required to answer the research questions is theoretically simple:

  • identify water on satellite images
  • the identifications relies on the way the water absorbs the sun light at difference wavelengths and frequencies of the electromagnetic spectrum


Conceptual model of the physical processes involved in the remote sensing process when working on water surfaces (source).

Alternatives available for water extraction

1 2 3 4

Density Slicing

  • Used e.g. by Marchetti et al. (2019)
  • Uses only one band (NIR, better results than visible bands)


  • Problems
    • risks introducing errors due to the presence of shadows (reliefs and urban areas).

Tasselated Cap Transformation

  • System for enhancing spectral band information
  • Compresses all information into three bands: greeness, brightness, wetness


  • Problems
    • Overestimates the presence of water, does not eliminate shadows.

Spectral indices

1 2 3 4

  • Combinations of several bands (usually two).
  • Spectral indices are combinations of spectral reflectance (percentage of reflected radiation) from two or more wavelengths indicating the relative abundance of features of interest.
  • More robust results allow the elimination of intrusive features

\(Index= \frac{Band1 - Band2}{Band1 + Band2}\)

Examples of spectral indices (source: geodose).

NDWI

1 2 3 4

  • Normalized Difference Water Index (NDWI)
  • Two formulas: McFeeters (1996) e Xu (2006) (MNDWI)

\(NDWI_{McFeeters}= \frac{Green - NIR}{Green + NIR}\)


\(NDWI_{xu}= \frac{Green - SWIR}{Green + SWIR}\)


Result

  • Interpretation of values based on an arbitrary threshold
  • if the pixel value in the image is greater than 0 = water
  • if the value is < 0 = non-water

Example of NDWI image generated by Green and SWIR bands

MNDWI

1 2 3 4

  • Normalized Difference Water Index (NDWI)
  • Two formulas: McFeeters (1996) e Xu (2006) (MNDWI)

\(NDWI_{McFeeters}= \frac{Green - NIR}{Green + NIR}\)


\(NDWI_{xu}= \frac{Green - SWIR}{Green + SWIR}\)


Result

  • Interpretation of values based on an arbitrary threshold
  • if the pixel value in the image is greater than 0 = water
  • if the value is < 0 = non-water

Example of NDWI image generated by Green and SWIR bands

MDNWI

1 2 3 4

\(NDWI_{xu}= \frac{Green - SWIR}{Green + SWIR}\)


Advantages

  • Better discrimination of urban areas near water thanks to the short-wave infrared (SWIR) band.
  • SWIR works better than NIR when there is no dense vegetation around water.

Example of NDWI image generated by Green and SWIR bands

Elaboration

1 2 3 4

  • Selection of satellite images1
  • NDWI Time-Series Processing
  • Image processing
  • Calculation of the percentage of surfaced area

Choice of satellite images

1 2 3 4

Satellite Images for the Mosul Dam Area
Satellite Years Spat. Res. Temp. Res. Tot. Img
Landsat 5 (TM) 1993-2001; 2004-2011 30m 16 days 34
Landsat 7 (ETM+) 2002 30m 16 days 2
Landsat 8 OLI 2014-2016 30m 16 days 6
Sentinel-2 (MSI) 2017-2020 20m 5 days 8


Satellite Images for the Mosul Dam Area
Satellite Years Spat. Res. Temp. Res. Tot. Img
Landsat 5 (TM) 2000-2011 30m 16 days 24
Landsat 8 OLI 2014-2016 30m 16 days 10
Sentinel-2 (MSI) 2017-2020 20m 5 days 12

DAHITI

1 2 3 4

  • Database for Hydrological Time Series of Inland Waters (DAHITI) (Schwatke et al. 2015).
    • Provides comparison data for maximum and minimum water level values
    • Allows the analysis result to be associated with a precise value
    • Provided a tool to limit the analysis time window
  • Objective: to reconstruct the emersion and submergence history of archaeological sites
  • Periods of maximum (h.w.l.) and minimum (l.w.l.) on which to focus the analysis

Image processing - GEE

1 2 3 4

  • Cloud computing platform for geospatial data launched in 2017 (Gorelick et al. 2017).
  • It combines a gigantic (petabytes) catalogue of satellite images and geospatial data.
  • It allows complex algorithms to be used very quickly.
  • GEE still requires knowledge of Javascript to be exploited to the full.

Image processing - GEE

1 2 3 4

  • Monthly Composites
    • Created by averaging the NDWI values for each pixel of each image within the monthly interval.
    • Helps reduce the presence of shadows caused by clouds or relief.
    • Two months per year filtered using DAHITI data.


Date l.w.l. Date h.w.l. Max. Water Level Min Water Level
2002-11 2002-07 314,443 327,871
2007-09 2007-07 313,142 318,879
2018-01 2018-06 292,862 311,833

Post-processing

1 2 3 4

  • Satellite image selection
  • NDWI Time-Series Processing
  • Image processing
  • Calculating the percentage of emerged surface area
  • Results

Reclassification

1 2 3 4

  • To calculate the temporal change within the site area, it is necessary to have unique values to quantify.
  • Reclassification in QGIS


\(Land < 0 > Water\)

Example of reclassification result

Zonal Histogram

1 2 3 4

  • QGIS Plugin (ArcGIS version with the same name)

  • Counts the number of unique pixels inside each polygon area

  • Output: Shapefile with the counts as column in the attribute table

  • Limitations:

    • Spatial resolution of input satellite images.
    • Reliability of the location and extent of archaeological sites.

Graphical representation of the Zonal Histogram algorithm (source: QGIS Documentation)

NDWI images and polygonal element representing the Jubaniyah area and the target of the zonal histogram algorithm

Technical constraints

1 2 3 4

Threshold

1 2 3 4

  • Despite the choice of formula and bands best suited to our case study, there was still some imperfection in the resulting images, mainly caused by shadows.
  • Threshold experimentation is manual and arbitrary (even the initial 0 is a convention).
    • A threshold of 0.05 eliminates much of the background noise without sensitively altering the results

\[Land < 0 > Water\] \[Land < 0.05 > Water\]

Obviously we need a qualitative measurement of the capabilities and results of our choices

Threshold

1 2 3 4

Example of reclassification with threshold 0

Example of reclassification with threshold 0.05

Accuracy assessment

1 2 3 4

  • The confusion matrix serves to provide a qualitative measure of how accurate our reclassification is
    • it often happens that due to various factors such as the resolution of the image used, the nature of the soil, etc., a pixel is recorded as false positive

Fisher (1997)

Deer et al. (1996)

Remón et al. (2013)

Accuracy assessment

1 2 3 4

  • Difficult to perform for long-term change-detection analysis
    • (Often) requires using the same data that has been reclassified (not optimal)
    • Time-consuming (must be performed manually)
    • Even if random sampling is performed, risk of overestimating the class most represented in the image.
  • Used: Images with same date but pansharpened and true-colour

Accuracy assessment

1 2 3 4

Confusion Matrix for the Mosul Dam Dataset
Satellite Images Overall Accuracy Producer’s Accuracy User’s Accuracy
Landsat 5 98.01% 98.63% 94.74%
Landsat 7 98.59% 100% 95.22%
Landsat 8 98.36% 99.24% 95.08%
Sentinel-2 98.36% 99.24% 94.99%


Confusion Matrix for the Tishreen Dam Dataset
Satellite Images Overall Accuracy Producer’s Accuracy User’s Accuracy
Landsat 5 98.74% 98.57% 97.18%
Landsat 8 98.31% 98.48% 95.59%
Sentinel-2 99.16% 98.57% 98.57%

Others

1 2 3 4

  • A robust archaeological dataset is advised
  • Presence of clouds
    • impact availability of satellite images
  • Sensible to NA values
  • No actual information on the accessibility of sites or their state of preservation

Results

1 2 3 4

  • Satellite image selection
  • NDWI Time-Series Processing
  • Image processing
  • Calculation of the percentage of emerged area
  • Results

General picture - Quantification

1 2 3 4

  • Sites that during the entire study period (1993-2020) were always submerged (never exposed, in black)
  • Sites that were always exposed (never submerged, in grey)
  • Sites that were affected by water level fluctuation during the study period (cyclically affected), further subdivided into:
    • Sites always submerged at h.w.l. of the year (always submerged at h.w.l., in red)
    • Cyclically surfaced sites (affected, in yellow)
    • Sites that have always re-emerged at the l.w.l. of the year (always exposed at the l.w.l., in green)

Spatial distribution

1 2 3 4

  • 53 sites always submerged at the h.w.l. of the year, in red
    • They are located at the bottom of the reservoir, near the dam itself. Most are mound settlements. Many were excavated during recovery activities prior to the flooding of the reservoir
  • 29 cyclically emerged sites (affected, in yellow)
    • They are located at the edge of the reservoir (to the SE and extreme NW). If the lake level rises above the maximum so far, they are the first to be affected and submerged. Recorded by modern surveys, i.e. EHAS and LoNAP. Almost half of them are mound settlements, followed by flat sites and structures.
  • 193 sites that always surfaced at the l.w.l. of the year (always exposed to the l.w.l., in green).
    • Different extents at the two extremes of minimum and maximum water level (C and D). Mostly flat settlements, structures, cemeteries.

Annual inspection of time-series data

1 2 3 4

The number of exposed sites peaked in 2010-2011 and 2015-2018, when more than 75 per cent of the sites resurfaced.

Annual inspection of time-series data

1 2 3 4

During years with similar water levels a similar number of sites emerge - Very useful for prevention and planning activities.

Annual inspection of time-series data

1 2 3 4

  • Sudden changes in the water level of the lake in the same year (even for only a few months)
  • Cause several sites to emerge from the waters


Individual sites resurfaced area

1 2 3 4

  • sites that were unaffected or only briefly submerged during episodes of maximum water level
  • sites that re-emerged only during periods of severe lake volume reduction
  • sites that mostly resurfaced during periods of l.w.l.

Reconstructing sites history

1 2 3 4

The high quality dataset and long-term analysis allow individual sites to be investigated if necessary

Percentage of area emerged during the years under analysis for the Jubaniyeh site

Reproducibilty and open-source

1 2 3 4

“Tools that do not facilitate well-documented, transparent, portable and reproducible data analysis workflows may, at best, result in irreproducible, unextendable research that does little to advance the discipline” (Marwick 2017).

Reproducibility

1 2 3 4

  • Free and globally available datasets
  • QGIS models to reproduce exact steps
  • Links to GEE code that can be shared and executed
  • All steps performed in GEE and QGIS were reproduced in R via two main libraries

Reproducibility

1 2 3 4

  • All steps performed in GEE and QGIS were reproduced in R via two main libraries
  yearmap <- function(m){
    start <- ee$Date(m)
    end <- ee$Date(m)$advance(1,"month") # take one image per month
    date_range <- ee$DateRange(start,end) # until the end of the specified period
    name <- area$cat(start$format("YYYY-MM"))$cat(sensor) # create image (file) name
    ImgYear <- BaseColl$
      filterDate(date_range)$ # filter for the selected date above
      filterBounds(study_region)$ # only inside the study region
      filter(cloud_filter)$ # Pre-filter to get less cloudy granules.
      map(cloudfunction)$ # apply the cloudmasking function
      map(getNDWI)$ # apply the function and get the NDWI band
      map(function(img){return(img$clip(study_region))}) # clip the images on the study region
    return(ImgYear$median()$set("name",name)) # take the median
  }
var S2monthmap = function(m){
  var start = ee.Date(m);
  var end = ee.Date(m).advance(1,'month'); // take one image per month
  var date_range = ee.DateRange(start,end); // until the end of the specified period
  var name = area.cat(start.format('YYYY-MM')).cat(sensor); // create image (file) name
  var ImgMonth = BaseColl
    .filterDate(date_range) // filter for the selected date above
    .filterBounds(study_region) // only inside the study region
    .filter(cloud_filter) // pre-filter to get less cloudy granules.
    .map(cloudfunction) // apply the cloudmasking function
    .map(getNDWI) // apply the function and get the NDWI band
    .map(function(img){return img.clip(study_region)}); // clip the images on the study region
  return(ImgMonth.median().set({name: name})); // take the median
};

Sharing and archival

1 2 3 4

  • Online code sharing (github “organisation”)
  • Using version control while working (git)
  • Archiving code with unique identifiers e.g. DOI (zenodo)

Ease of use

1 2 3 4

  • Only a few starting options to tweak
  • Conditionals and functions to take carer of different options
  • Commented code
  • Platform-agnostic (Windows/Mac/Linux through R/GEE/QGIS)

Conclusions

1 2 3 4

Conclusions

1 2 3 4

Advantages

  • Relies on free data and softwares
  • Reproducible
  • Multiplatform
  • Potentially applicable to different regions (next case study -> Tishreen Dam)

Disadvantages

  • It still require some manual work and knowledge
  • Accuracy assessment
  • Softwares to install

Conclusions

1 2 3 4

  • Web-apps
  • Shiny Apps (R)
  • Google Earth Engine (Javascript)

Thank you!

Andrea Titolo (andrea.titolo@unito.it)


Link to presentation


Works Cited

Deer, P. J., Eklund, P. W. and Norman, B. D. (1996). A Mahalanobis distance fuzzy classifier. 1996 Australian New Zealand Conference on Intelligent Information Systems. Proceedings. ANZIIS 96.
Fisher, P. (1997). The pixel: A snare and a delusion. International Journal of Remote Sensing 18: 679–685.
Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D. and Moore, R. (2017). Google Earth Engine: Planetary-scale Geospatial Analysis for Everyone. Remote Sensing of Environment 202: 18–27.
Marchetti, N., Curci, A., Gatto, M. C., Nicolini, S., Mühl, S. and Zaina, F. (2019). A Multi-Scalar Approach for Assessing the Impact of Dams on the Cultural Heritage in the Middle East and North Africa. Journal of Cultural Heritage 37: 17–28.
Marwick, B. (2017). Computational Reproducibility in Archaeological Research: Basic Principles and a Case Study of Their Implementation. Journal of Archaeological Method and Theory 24: 424–450.
McFeeters, S. K. (1996). The Use of the Normalized Difference Water Index (NDWI) in the Delineation of Open Water Features. International Journal of Remote Sensing 17: 1425–1432.
Remón, A., Sánchez, S., Bernabé, S., Quintana-Ortı́, E. S. and Plaza, A. (2013). Performance versus energy consumption of hyperspectral unmixing algorithms on multi-core platforms. EURASIP Journal on Advances in Signal Processing 2013:
Schwatke, C., Dettmering, D., Bosch, W. and Seitz, F. (2015). DAHITI an Innovative Approach for Estimating Water Level Time Series over Inland Waters Using Multi-Mission Satellite Altimetry. Hydrology and Earth System Sciences 19: 4345–4364.
Titolo, A. (2021). Use of Time-Series NDWI to Monitor Emerging Archaeological Sites: Case Studies from Iraqi Artificial Reservoirs. Remote Sensing 13: 786.
Xu, H. (2006). Modification of Normalised Difference Water Index (NDWI) to Enhance Open Water Features in Remotely Sensed Imagery. International Journal of Remote Sensing 27: 3025–3033.